Biological systems typically involve large numbers of components with complex, highly parallel
interactions and intrinsic stochasticity. To model this complexity, numerous programming languages 
based on process calculi have been developed, many of which are expressive enough to generate 
unbounded numbers of molecular species and reactions. As a result of this expressiveness, such 
calculi cannot rely on standard reaction-based simulation methods, which require fixed numbers of 
species and reactions. Rather than implementing custom stochastic simulation algorithms for each 
process calculus, we propose to use a generic abstract machine that can be instantiated to a range of 
process calculi and a range of reaction-based simulation algorithms. The abstract machine functions 
as a just-in-time compiler, which dynamically updates the set of possible reactions and chooses the 
next reaction in an iterative cycle. In this short paper we give a brief summary of the generic abstract 
machine, and show how it can be instantiated with the stochastic simulation algorithm known as 
Gillespie's Direct Method. We also discuss the wider implications of such an abstract machine, and 
outline how it can be used to simulate multiple calculi simultaneously within a common framework. 

1 Introduction 

Biological systems typically involve large numbers of components with complex, highly parallel
interactions and intrinsic stochasticity. To model this complexity, numerous programming languages
based on process calculi have been developed, many of which are expressive enough to generate
unbounded numbers of molecular species and reactions. Examples include variants of the stochastic
pi-calculus, BlenX [3], the kappa calculus [2], and variants of the bioambient calculus [13, 10].
As a result of this expressiveness, such calculi cannot rely on standard reaction-based simulation
methods, which require fixed numbers of species and reactions. Instead, a custom simulation
algorithm is typically developed for each calculus. The choice of algorithm depends on the nature of
the underlying biological system, such as whether exact simulation is required, whether certain
reactions operate at different timescales [16], or whether non-Markovian reaction rates are needed.



Rather than implementing custom stochastic simulation algorithms for each process calculus, we 
propose to use a generic abstract machine that can be instantiated to a range of process calculi and a 
range of reaction-based simulation algorithms. The abstract machine functions as a just-in-time compiler, 
which dynamically updates the set of possible reactions and chooses the next reaction in an iterative 
cycle. The abstract machine is instantiated to a particular calculus by defining two functions: one for 
transforming a process of the calculus to a set of species, and another for computing the set of possible 
reactions between species. The abstract machine is instantiated to a particular simulation algorithm by 
defining three functions: one for computing the next reaction, one for computing the reaction activity
from an initial set of reactions and species populations, and a third for updating the reaction activity as the 
species populations change over time. Having a clear separation between the simulation algorithm and 
the language specification allows us not only to easily instantiate the machine to different process calculi,
but also to add new simulation algorithms that can be shared between calculi. Furthermore, the approach
could be used to dynamically integrate the simulation of multiple process calculi simultaneously, acting
as a common language runtime for the simulation of process calculi for biology.

In this short paper we give a brief summary of the generic abstract machine of [8], and show how it
can be instantiated with the stochastic simulation algorithm of [6]. We also discuss the wider implications
of such an abstract machine, and outline how it can be used to simulate multiple calculi simultaneously
within a common framework.

Table 1: Syntax of the generic abstract machine, where a term T consists of the current time t, a species
map S and a reaction map R. We let $\bar{I}$ denote a multiset of species $\{I_1, \ldots, I_N\}$ and $\bar{O}$ denote a set of
reactions.

    term   syntax                                            description
    T      $(t, S, R)$                                       Time t, species map S, reaction map R
    S      $\{I_1 \mapsto i_1, \ldots, I_N \mapsto i_N\}$    Map from a species I to its population i
    R      $\{O_1 \mapsto A_1, \ldots, O_N \mapsto A_N\}$    Map from a reaction O to its activity A
    O      $(\bar{I}, r, \bar{I}')$                          Reaction with reactants $\bar{I}$, products $\bar{I}'$ and rate r

Table 2: Parameterised definition of the generic abstract machine. If $\bar{I}$ is a multiset $\{I_1, \ldots, I_N\}$ we write
$\bar{I} \oplus T$ for $I_1 \oplus \ldots \oplus I_N \oplus T$ and $T \ominus \bar{I}$ for $T \ominus I_1 \ominus \ldots \ominus I_N$ (the order is unimportant). We write dom(S) for
the domain of S. We also write S(I) for the value associated with I in S, and $S\{I \mapsto v\}$ for S updated so
that v is associated with I.

    function                   definition
    $P \oplus T$               $= \mathit{species}(P) \oplus T$
    $I \oplus (t,S,R)$         $= (t, S', R \cup R')$  if $\bar{I}' = \mathit{dom}(S)$; $I \notin \bar{I}'$; $S' = S\{I \mapsto 1\}$;
                                 $\bar{O} = \mathit{reactions}(I, \bar{I}')$; $R' = \mathit{init}(\bar{O}, (t,S',R))$
    $I \oplus (t,S,R)$         $= (t, S', R \cup R')$  if $S(I) = i$; $S' = S\{I \mapsto i+1\}$; $R' = \mathit{updates}(I, (t,S',R))$
    $(t,S,R) \ominus I$        $= (t, S', R \cup R')$  if $S(I) = i$; $S' = S\{I \mapsto i-1\}$; $R' = \mathit{updates}(I, (t,S',R))$

2 Summary of the Abstract Machine 

The syntax of the generic abstract machine is summarised in Table 1 and is based on the definitions of
[8]. A machine term T is a triple (t,S,R), where t is the current time, S is a map from a species I to
its population i, and R is a map from a reaction O to its activity A, which is used to compute the next
reaction. Each reaction is represented by a tuple $(\bar{I}, r, \bar{I}')$, where $\bar{I}$ denotes the multiset of reactant species,
$\bar{I}'$ denotes the multiset of product species and r denotes the reaction rate. The structure of a term of the
abstract machine can be summarised as follows.



    Machine term T
      Time t
      Species map S                  Reaction map R
        Species    Population          Reaction                              Activity
        $I_1$      $i_1$               $(\bar{I}_1, r_1, \bar{I}'_1)$        $A_1$
        ...        ...                 ...                                   ...
        $I_N$      $i_N$               $(\bar{I}_M, r_M, \bar{I}'_M)$        $A_M$
Table 3: Instantiation of the generic abstract machine with the stochastic simulation algorithm of [6]. We
write $\{E_i \mid C_1; \ldots; C_N\}$ to denote the set of elements $E_i$ that satisfy conditions $C_1; \ldots; C_N$. We let $n_1$ and $n_2$
denote two random numbers from the standard uniform distribution, U(0,1). The function propensity(O,S)
is defined in the main text.

    function                  definition
    $\mathit{next}(t,S,R)$           $= (O_\mu, t + t')$  if $a_0 = \sum_{O_i \in \mathit{dom}(R)} R(O_i)$; $t' = (1/a_0)\ln(1/n_1)$;
                                $\sum_{i=1}^{\mu-1} a_i < n_2 a_0 \le \sum_{i=1}^{\mu} a_i$
    $\mathit{init}(\bar{O}, (t,S,R))$    $= \{O_i \mapsto \mathit{propensity}(O_i, S) \mid O_i \in \bar{O}\}$
    $\mathit{updates}(I, (t,S,R))$   $= \{O_i \mapsto \mathit{propensity}(O_i, S) \mid O_i \in \mathit{dom}(R); O_i = (\bar{I}, r, \bar{I}'); I \in \bar{I}\}$



To instantiate the abstract machine with a given process calculus, it is sufficient to define a function 
species(P) for transforming a process P to a multiset of species, together with a function $\mathit{reactions}(I, \bar{I}')$
for computing the set of reactions between a new species I and an existing set of species $\bar{I}'$. The syntax
of species I is specific to the choice of process calculus. The species function is used to initialise the 
abstract machine at the beginning of a simulation, while the reactions function is used to update the set 
of possible reactions dynamically. This allows systems with potentially unbounded numbers of species 
and reactions to be simulated. 
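
As an illustration of these two functions, the following minimal sketch instantiates them for a toy polymerisation calculus. The calculus itself is hypothetical and not taken from the paper: a species is an integer polymer length, a process is a list of lengths, any two polymers may bind, and the constant BIND_RATE is an assumed rate. The example is chosen because the set of species it can generate is unbounded, so reactions must be computed on demand.

```python
from collections import Counter

# Hypothetical rate constant for every binding reaction (an assumption).
BIND_RATE = 1.0


def species(process):
    """Transform a process (here: a list of polymer lengths) into a
    multiset of species, represented as a Counter."""
    return Counter(process)


def reactions(new, existing):
    """Compute the reactions between a new species and the existing ones.

    Each reaction is a tuple (reactants, rate, products). The new species
    may also react with another copy of itself.
    """
    found = set()
    for other in set(existing) | {new}:
        reactants = tuple(sorted((new, other)))
        found.add((reactants, BIND_RATE, (new + other,)))
    return found
```

Calling reactions(2, {1}) yields the two reactions 1 + 2 -> 3 and 2 + 2 -> 4. Species 3 and 4 only enter the machine, and trigger further calls to reactions, once one of these reactions has actually fired.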

To instantiate the abstract machine with a given simulation algorithm, it is sufficient to define a 
function next(T) for choosing the next reaction from a term T, a function $\mathit{init}(\bar{O}, T)$ for initialising a
term with a set of reactions $\bar{O}$, and a function updates(I, T) for updating the reactions in a term affected
by a given species I. The abstract machine is then executed by repeated application of the following rule. 

$$\frac{(\bar{I}, r, \bar{I}'),\ t' = \mathit{next}(t,S,R)}{(t,S,R) \xrightarrow{r} \bar{I}' \oplus ((t',S,R) \ominus \bar{I})}$$

Each time the next reaction is selected, it is executed by removing the reactants $\bar{I}$ from the machine
term, adding the products $\bar{I}'$ and updating the current time of the machine. Corresponding definitions
for adding and removing species are summarised in Table 2. A process P is added to a machine term T
by computing the multiset of species $\{I_1, \ldots, I_N\}$ which correspond to P and then adding each of these
species to the term. If a new species I is already present in the term then its population is incremented
in S and the activity of the affected reactions is updated. If the species is not already present in the term,
its population is set to 1 in S and new reactions for the species are computed, together with their activity.
The operation $T \ominus I$ removes the species I from the machine term T, by decrementing the corresponding
species populations and by updating the affected reactions. 
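
The add, remove and execution steps described above can be sketched as follows. This is an illustrative reading of the definitions rather than the authors' implementation. The names Term, add_species, remove_species and step are hypothetical, the calculus function (reactions) and the algorithm functions (init, updates, next_reaction) are passed in as parameters, and the term is mutated in place for brevity, whereas the definitions are stated functionally.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """Machine term (t, S, R)."""
    time: float = 0.0
    populations: dict = field(default_factory=dict)  # species map S
    activities: dict = field(default_factory=dict)   # reaction map R


def add_species(sp, term, reactions, init, updates):
    """Add one copy of species sp to the term."""
    if sp not in term.populations:
        existing = set(term.populations)         # existing species, dom(S)
        term.populations[sp] = 1                 # new species: population 1
        new_reactions = reactions(sp, existing)  # computed on demand
        term.activities.update(init(new_reactions, term))
    else:
        term.populations[sp] += 1
        term.activities.update(updates(sp, term))
    return term


def remove_species(sp, term, updates):
    """Remove one copy of species sp from the term."""
    term.populations[sp] -= 1
    term.activities.update(updates(sp, term))
    return term


def step(term, reactions, init, updates, next_reaction):
    """Apply the execution rule once: pick a reaction, advance the time,
    remove the reactants, then add the products."""
    (reactants, _rate, products), new_time = next_reaction(term)
    term.time = new_time
    for sp in reactants:
        remove_species(sp, term, updates)
    for sp in products:
        add_species(sp, term, reactions, init, updates)
    return term
```

Because step never inspects the syntax of a species or the meaning of an activity, the same loop can be reused unchanged for any calculus and any simulation algorithm that supply the five functions.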



3 Instantiating the Abstract Machine 

An instantiation of the abstract machine with the stochastic simulation algorithm of [6] is outlined in
Table 3. Each reaction $(\bar{I}, r, \bar{I}')$ is mapped to its propensity $a_i$, which is computed by multiplying the rate
of the reaction by the number of distinct combinations of the reactants $\bar{I}$. The function propensity(O,S)
computes the propensity of the reaction O given the species map S and is defined as follows, assuming
that reactions are either unary or binary and that $I_1$ and $I_2$ are distinct species.



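$$
\begin{aligned}
\mathit{propensity}((\{I_1\}, r, \bar{I}'), S) &\triangleq r \times S(I_1) \\
\mathit{propensity}((\{I_1, I_1\}, r, \bar{I}'), S) &\triangleq r \times S(I_1) \times (S(I_1) - 1)/2 \\
\mathit{propensity}((\{I_1, I_2\}, r, \bar{I}'), S) &\triangleq r \times S(I_1) \times S(I_2)
\end{aligned}
$$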

The function $\mathit{init}(\bar{O}, T)$ computes the initial propensity for each reaction in $\bar{O}$, using the initial species
populations in T, while the function updates(I, T) updates the propensities of all the reactions in T for
which I is a reactant. Finally, the function next(T) chooses a reaction from T with probability proportional
to the reaction propensity, and computes the corresponding duration of the reaction as defined in Table 3.
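
The following is a minimal sketch of this instantiation, assuming that a term is a plain tuple (t, S, R) of a float and two dictionaries and that a reaction is a tuple (reactants, rate, products). These representations, the name next_reaction and the rng parameter are assumptions made for illustration. The delay uses the standard Direct Method formula, an exponential waiting time with rate equal to the total propensity.

```python
import math
import random


def propensity(reaction, S):
    """Rate times the number of distinct reactant combinations
    (unary and binary reactions only)."""
    reactants, rate, _products = reaction
    if len(reactants) == 1:
        return rate * S.get(reactants[0], 0)
    first, second = reactants
    if first == second:
        n = S.get(first, 0)
        return rate * n * (n - 1) / 2
    return rate * S.get(first, 0) * S.get(second, 0)


def init(new_reactions, term):
    """Initial propensity of each newly discovered reaction."""
    _t, S, _R = term
    return {o: propensity(o, S) for o in new_reactions}


def updates(sp, term):
    """Recompute only the reactions that have sp as a reactant."""
    _t, S, R = term
    return {o: propensity(o, S) for o in R if sp in o[0]}


def next_reaction(term, rng=random):
    """Choose a reaction with probability proportional to its propensity
    and return it together with the new absolute time."""
    t, _S, R = term
    a0 = sum(R.values())
    if a0 <= 0:
        raise ValueError("no reaction can fire")
    n1 = 1.0 - rng.random()  # lies in (0, 1], so the logarithm is defined
    n2 = rng.random()
    delay = (1.0 / a0) * math.log(1.0 / n1)
    threshold = n2 * a0
    cumulative = 0.0
    chosen = None
    for o, a in R.items():
        if a <= 0:
            continue  # a reaction with zero propensity cannot fire
        cumulative += a
        chosen = o
        if threshold <= cumulative:
            break
    return chosen, t + delay
```

For example, with S = {"A": 3, "B": 2}, the reaction A + A at rate 1.0 has propensity 1.0 × 3 × 2 / 2 = 3.0, and the reaction A + B at rate 0.5 has propensity 0.5 × 3 × 2 = 3.0. If the loop ends without the threshold being reached, which can only happen through floating-point rounding, the last reaction with positive propensity is returned.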



We have also instantiated the abstract machine to the Next Reaction Method of [5] and to the Non-
Markovian Next Reaction Method of [8], by defining corresponding init, next and updates functions,
as described in [8]. We have used the abstract machine to implement the DNA Strand Displacement
(DSD) calculus for modelling DNA circuits [12], the Genetic Engineering of Cells (GEC) calculus for
modelling of genetic devices, and the Stochastic Pi Machine (SPiM) calculus for general modelling
of biological systems, by defining appropriate species and reactions functions for each calculus.
Simulators for these three calculi are available online at http://research.microsoft.com/dna,
http://research.microsoft.com/gec and http://research.microsoft.com/spim, respectively.
Technical details of the instantiation of the generic abstract machine with the stochastic pi-calculus and
the bioambient calculus are outlined in [8]. We are currently developing an instantiation of the generic
abstract machine to the kappa calculus of [2]. Although the idea of integrating different modelling and
simulation methods within a common framework is not a new one [4], our approach is the first attempt 
to formally define a generic framework for simulating a broad range of process calculi with an arbitrary 
reaction-based simulation algorithm. 

4 A Common Simulation Framework 

The generic abstract machine can be used to simulate multiple calculi simultaneously by assuming a 
separate species type $I_L$ for each calculus L, together with an initial set of cross-calculus reactions $\bar{O}_0$. An
example of a cross-calculus reaction is $I_{DSD} + I_{SPiM} \rightarrow I_{SPiM} + I'_{SPiM}$, which takes a species of the DSD
language, such as a known DNA vaccine assembled via strand displacement, together with a species of
the SPiM language, such as a polymerase, and produces a corresponding protein species in SPiM together
with the original polymerase. The reaction therefore enables the output of a strand displacement model
in DSD to interface with a cellular model in SPiM. For each dynamically created species $I_L$ the function
$\mathit{reactions}(I_L, \bar{I}')$ calls the appropriate calculus-specific function $\mathit{reactions}_L(I_L, \bar{I}'_L)$, where $\bar{I}'_L$ denotes the
subset of species in $\bar{I}'$ that are of type L. This approach allows multiple calculi to interact with each other
within the same simulation environment, via a fixed set of interface reactions. Further work is needed to 
formalise the multi-language execution paradigm in more detail. 
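
One possible reading of this dispatch is sketched below. It assumes that every species is tagged as a pair (calculus name, payload). The tagging scheme, the function name combine, the species names and the rate 0.1 are hypothetical and only serve the illustration.

```python
def combine(per_calculus):
    """Build a single reactions function from calculus-specific ones.

    per_calculus maps a calculus name to its own reactions function.
    Assumption: every species is a (calculus_name, payload) pair.
    """
    def reactions(new, existing):
        calculus, _payload = new
        # Only species of the same calculus are visible to that calculus.
        same_type = {sp for sp in existing if sp[0] == calculus}
        return per_calculus[calculus](new, same_type)
    return reactions


# A fixed set of cross-calculus interface reactions, supplied up front
# rather than discovered dynamically (names and rate are hypothetical).
TEMPLATE = ("DSD", "template")
POLYMERASE = ("SPiM", "polymerase")
PROTEIN = ("SPiM", "protein")
CROSS_REACTIONS = {((TEMPLATE, POLYMERASE), 0.1, (POLYMERASE, PROTEIN))}
```

In this sketch the cross-calculus reactions would be passed to init once, when the machine is created, while every species created later is routed to the reactions function of its own calculus.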

The generic abstract machine can therefore be used to simulate a range of existing process calculi 
within a common framework. By decoupling the choice of calculus from the choice of simulation
algorithm, multiple calculi can re-use the same algorithm via a common interface, without the need to
implement custom simulation algorithms for each calculus. In future, this could allow models to be 
constructed from components written in different domain-specific languages, each designed to allow a 
natural, concise encoding of that component. The components could then interact dynamically via a 
common language runtime, allowing integrated simulation of heterogeneous biological systems. 